This article describes a recently introduced transform algorithm called the integer 
cosine transform (ICT), which is used in transform-based data compression schemes. 
The ICT algorithm requires only integer operations on small integers and at the 
same time gives a rate-distortion performance comparable to that offered by the 
floating-point discrete cosine transform (DCT). The article addresses the issue of 
implementation complexity, which is of prime concern for source coding applications 
of interest in deep-space communications. Complexity reduction in the transform 
stage of the compression scheme is particularly relevant, since this stage accounts 
for most (typically over 80 percent) of the computational load. 


I. Introduction 

The rate-distortion performance of three transform- 
based coding schemes used to compress the test images for 
the Comet Rendezvous Asteroid Flyby (CRAF)/Cassini 
Project was presented in [1]. More recently, the issue 
of implementation complexity, which is of prime concern 
to spacecraft applications, was addressed. The compu- 
tational bottleneck of transform-based algorithms lies in 
the front-end transform stage, which accounts for over 
80 percent of the computational load of these compres- 
sion schemes. This article describes a recently introduced 
transform algorithm called the integer cosine transform 
(ICT), which requires only integer operations on small
integers and at the same time has rate-distortion performance
comparable to that of the floating-point discrete cosine
transform (DCT), which is the most practical and near optimal
approach known for data compression. The implementa- 
tion complexity of the ICT is substantially lower than that 
of the DCT, and is comparable to that of the Hadamard 
transform (HT). The ICT is a practical approach to achiev- 
ing the high-rate deep-space communications that are pos- 
sible with the DCT. 


II. Background: Transform-Based Schemes 

In preparing the test images for the CRAF/Cassini 
Project, three transform-based encoding algorithms were 
used to compress a set of seven planetary images [1], which 
are continuous-tone gray-scale, with pixel values ranging 
from 0 (black) to 255 (white). All three algorithms can be 
viewed as consisting of three stages, as illustrated in Fig. 1: 
the data transform stage, the quantization stage, and the 
entropy-coding stage. The compression algorithms work 
on a block-by-block basis, i.e., they compress one 8 x 8
block of the picture at a time. In each algorithm, the
encoder first applies an 8 x 8 floating-point DCT or an
8 x 8 HT to the picture block to generate an 8 x 8 block of
transform coefficients. These numbers are then quantized
to integer values by a predetermined 8 x 8 quantization
template. Most quantized values have small magnitudes.
Due to the skewed distribution of the quantized transform 
coefficients, compression is achieved by assigning shorter 
transmission-bit patterns to the more frequently occurring 
integers. This is realized in the last stage of the compres- 
sion scheme, the entropy coder, which maps the quantized 
values to appropriate transmission-bit patterns. 
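
The quantization stage and the motivation for the entropy coder can be illustrated with a minimal sketch. It assumes that the template acts as a per-coefficient step size (divide, then round), which the text does not spell out; the 4 x 4 coefficient block, the uniform step of 16, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def quantize(coeffs, template):
    """Stage 2 (assumed form): divide by the template step, round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, template)]

def empirical_entropy(values):
    """Zeroth-order entropy, in bits per value, of a sequence of integers."""
    counts = Counter(values)
    total = len(values)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical 4 x 4 block of transform coefficients: one large low-frequency
# term and small high-frequency terms (illustrative numbers, not from [1]).
coeffs = [
    [812.0, -41.3, 12.2, -3.1],
    [-35.6,  18.4, -4.9,  1.2],
    [  9.8,  -5.5,  2.1, -0.7],
    [ -2.4,   1.1, -0.6,  0.3],
]
template = [[16] * 4 for _ in range(4)]  # hypothetical uniform step size

quantized = quantize(coeffs, template)
flat = [v for row in quantized for v in row]
print(quantized)
print("entropy (bits/value):", empirical_entropy(flat))
```

Most quantized values are zero, so the empirical entropy falls well below the 4 bits per value that 16 equally likely symbols would need; this skew is what the entropy coder exploits.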

In the CRAF/Cassini data compression experiment, all
three transform-based schemes used the same DCT or HT
(stage 1) and the same quantization template (stage 2).
The difference lies in the choice of entropy coder in the
third stage, where one may use the Joint Photographic
Experts Group (JPEG) Huffman code [2,12], an arithmetic
code [2], or the Gallager-van Voorhis-Huffman (GVH)
code [3]. In general, the DCT-based schemes are more
effective (by 0.1 to 0.3 bits per pixel) than the HT-based
schemes, especially in the high-bit-rate (near-lossless)
range. However, the more effective DCT-based schemes
are more computationally intensive than the HT-based 
schemes. The major computational burden of DCT-based 
schemes lies in the DCT stage, which requires a large num- 
ber of floating-point multiplications and additions. HT- 
based schemes, on the other hand, require only integer 
additions and subtractions in the transform stage. From 
the hardware’s point of view, floating-point operations are 
much slower and more difficult to implement than the 
corresponding integer operations. For a general N-point
DCT, a straightforward algorithm [4] that yields a simple
regular implementation and a small chip size requires $2N^3$
multiplications and $2N^3$ additions. A more sophisticated
N-point fast DCT [5], where N is a power of 2, that uses
complex data-shuffling strategies still requires $N^2 \log_2 N$
multiplications and $N(3N \log_2 N - N + 1)$ additions. The
large number of floating-point operations required to perform
the DCT, particularly for large N, is the computational
bottleneck for all DCT-based signal-processing schemes.
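
To give a feel for these counts, the short sketch below evaluates them for N = 8 and N = 16. It assumes the operation counts quoted above read as 2N^3 multiplications and additions for the straightforward algorithm, and N^2 log2 N multiplications and N(3N log2 N - N + 1) additions for the fast algorithm; the function names are illustrative.

```python
def straightforward_ops(n):
    """(multiplications, additions) for the straightforward algorithm, assumed 2N^3 each."""
    return 2 * n ** 3, 2 * n ** 3

def fast_ops(n):
    """(multiplications, additions) for the fast algorithm; n must be a power of 2."""
    log_n = n.bit_length() - 1
    if 2 ** log_n != n:
        raise ValueError("n must be a power of 2")
    return n * n * log_n, n * (3 * n * log_n - n + 1)

for n in (8, 16):
    print(n, straightforward_ops(n), fast_ops(n))
```

Even the fast algorithm needs a few hundred floating-point multiplications per block at N = 8, which is the cost the integer transform is meant to avoid.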


III. Integer Cosine Transform

Recently Choy, Cham, and Lee [6] proposed a new
8-point transform called the integer cosine transform
(ICT), which requires only integer multiplications and
additions, and thus is much simpler to implement than the
DCT. An ICT chip was fabricated and was proven to be
efficient in both silicon area and speed [6]. The 8 x 8 ICT
matrix suggested in [6] is given in Fig. 2(c). Notice that the
elements in the matrix are all integers, and the ICT matrix
B in Fig. 2(b) has sign and magnitude patterns that
resemble those of the DCT matrix A in Fig. 2(a). The
similarity of the ICT matrix to the DCT matrix, together
with the orthogonality property of the ICT ($BB^t = \Delta$,
where $\Delta$ is a diagonal matrix), guarantees that the ICT,
as well as its inverse, possesses the same transform structure
as the DCT. Thus, any fast DCT algorithm can be
used to compute a fast ICT.

This 8 x 8 ICT matrix was used to compute a two-dimensional
8 x 8 transform and then compress the planetary
images saturn1 and saturn5. The transform coefficients
were quantized by using the same quantization template
as in the aforementioned DCT-based schemes. The
entropy of the quantized transform coefficients and the
mean square error (MSE) of the reconstructed picture were
computed, and the results are shown in Fig. 3. These simulation
results indicate that any difference in rate-distortion
performance between the floating-point DCT
and the ICT is unnoticeable.

Although the 8-point ICT proposed by Choy, Cham,
and Lee performs remarkably well, it is quite ad hoc, and
no general mathematical formulation of the ICT is given
in [6]. The contributions of this article are to put the ICT
into a more formal mathematical setting, and to generalize
their idea to any N-point ICT. The mathematical
properties of the ICT are investigated in the following sections.
Since the ICT is separable, and the extension of
the one-dimensional ICT to two dimensions is straightforward,
this article focuses on the one-dimensional case.
Section IV gives a characterization of ICT matrices. An
8 x 8 ICT matrix that is multiplication-free and requires
only binary additions and shifts is given in Section V (the
MSE versus entropy performance of the multiplication-free
ICT, the original ICT of [6], and that of the floating-point
DCT are shown in Fig. 3). A general procedure for the
construction of an N x N ICT matrix is obtained in Section VI,
and two 16 x 16 ICT matrices, one with only
small integer entries and one with all entries powers of
two (multiplication-free), are exhibited in Section VII.


IV. Mathematical Properties of the ICT 

The integer cosine transform and the discrete cosine
transform are closely related. Let $C$ and $A$ be the
$N \times N$ ICT and DCT matrices, respectively. The orthonormal
matrix $A = [a_{kn}]$ (i.e., $AA^t = I$) is defined as follows for
$0 \le n \le N - 1$:

$$
a_{kn} =
\begin{cases}
\sqrt{\dfrac{1}{N}}, & k = 0 \\[2ex]
\sqrt{\dfrac{2}{N}} \cos \dfrac{\pi (2n + 1) k}{2N}, & 1 \le k \le N - 1
\end{cases}
\tag{1}
$$
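
As a numerical check of Eq. (1), the sketch below builds the DCT matrix and measures how far $AA^t$ is from the identity. It assumes the standard orthonormal scaling, that is, $\sqrt{1/N}$ for the row k = 0 and $\sqrt{2/N}$ for the other rows; the function names are illustrative.

```python
import math

def dct_matrix(n):
    """Return the n x n DCT matrix A = [a_kn] of Eq. (1) as nested lists."""
    rows = []
    for k in range(n):
        # Assumed orthonormal scaling: sqrt(1/n) for k = 0, sqrt(2/n) otherwise.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def gram(m):
    """Return m times its transpose."""
    return [[sum(x * y for x, y in zip(r, s)) for s in m] for r in m]

def max_deviation_from_identity(m):
    """Largest absolute entry of (m m^t - I)."""
    g = gram(m)
    size = len(m)
    return max(abs(g[i][j] - (1.0 if i == j else 0.0))
               for i in range(size) for j in range(size))

for n in (8, 16):
    print(n, max_deviation_from_identity(dct_matrix(n)))
```

The deviation is at the level of floating-point rounding for both sizes, which confirms that the rows of A are orthonormal under this scaling.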


Using $A$ as a template, the ICT matrix $C = [c_{kn}]$ is an
orthogonal matrix (i.e., $CC^t = \Delta$, where $\Delta$ is a diagonal
matrix) with the following properties:

(1) Integer property: $c_{kn}$ is an integer for $0 \le k, n \le N - 1$.

(2) Orthogonality property: Rows (or columns) of $C$ are
orthogonal.

(3) Relationship with DCT:

(a) $\mathrm{sgn}(c_{kn}) = \mathrm{sgn}(a_{kn})$ for $0 \le k, n \le N - 1$.

(b) If $a_{kn} = a_{st}$, then $c_{kn} = c_{st}$ for $0 \le k, n, s, t \le N - 1$.

The integer property eliminates real multiplication and 
real addition operations. The orthogonality property as- 
sures that the inverse ICT has the same transform struc- 
ture as the ICT. Notice that C is only required to be or- 
thogonal, but not orthonormal. However, any orthogonal 
matrix can be made orthonormal by multiplying it by an 
appropriate diagonal matrix. This operation can be in- 
corporated in the quantization (dequantization) stage of 
the compression (decompression) scheme, thus sparing the 
ICT (inverse ICT) from floating-point operations, and at 
the same time preserving the same transform structure as 
in the floating-point DCT (inverse DCT). The relationship 
between ICT and DCT guarantees efficient energy packing 
and allows the use of any fast DCT technique for the ICT. 
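
The properties above can be checked mechanically. The sketch below does so for the 8 x 8 ICT of [6] with (a, b, c, d, e, f) = (5, 3, 2, 1, 3, 1). The row layout of the matrix is an assumption: it copies the sign pattern of the 8-point DCT and places the symbols by magnitude. The variable names are illustrative.

```python
import math

# ICT matrix for (a, b, c, d, e, f) = (5, 3, 2, 1, 3, 1) from [6]. The row
# layout is an assumption: it copies the sign pattern of the 8-point DCT.
C = [
    [1,  1,  1,  1,  1,  1,  1,  1],
    [5,  3,  2,  1, -1, -2, -3, -5],
    [3,  1, -1, -3, -3, -1,  1,  3],
    [3, -1, -5, -2,  2,  5,  1, -3],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [2, -5,  1,  3, -3, -1,  5, -2],
    [1, -3,  3, -1, -1,  3, -3,  1],
    [1, -2,  3, -5,  5, -3,  2, -1],
]
N = len(C)

def sign(x):
    return (x > 0) - (x < 0)

def gram(m):
    """Return m times its transpose."""
    return [[sum(x * y for x, y in zip(r, s)) for s in m] for r in m]

G = gram(C)

# Properties (1) and (2): integer entries and C C^t diagonal.
integer_ok = all(isinstance(v, int) for row in C for v in row)
orthogonal_ok = all(G[i][j] == 0 for i in range(N) for j in range(N) if i != j)

# Property (3a): signs agree with the DCT entries.
sign_ok = all(
    sign(C[k][n]) == sign(math.cos(math.pi * (2 * n + 1) * k / (2 * N)))
    for k in range(N) for n in range(N)
)

# Row energies d_k on the diagonal of C C^t. Scaling row k by 1/sqrt(d_k)
# makes the transform orthonormal; for the 2-D transform, coefficient (k, l)
# is scaled by 1/sqrt(d_k * d_l), which can be merged into the quantizer step.
row_energy = [G[k][k] for k in range(N)]
scale_2d = [[1.0 / math.sqrt(dk * dl) for dl in row_energy] for dk in row_energy]

normalized = [[v / math.sqrt(row_energy[k]) for v in C[k]] for k in range(N)]
GN = gram(normalized)
orthonormal_ok = all(abs(GN[i][j] - (i == j)) < 1e-12
                     for i in range(N) for j in range(N))

print(integer_ok, orthogonal_ok, sign_ok, orthonormal_ok)
print(row_energy)
```

The table `scale_2d` holds the only non-integer factors, and it multiplies the coefficients after the transform, so the transform itself stays in integer arithmetic.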


V. ICT for N = 8 

The floating-point 8 x 8 DCT matrix is shown in
Fig. 2(a). A general structure of the 8 x 8 ICT matrix
is given in Fig. 2(b). The symbols a, b, c, d, e, and f in
Fig. 2(b) are numbers that satisfy conditions (1) through
(3) given in Section IV. It was suggested in [6] to use a = 5,
b = 3, c = 2, d = 1, e = 3, and f = 1 for the N = 8
ICT, as shown in Fig. 2(c). There are many other sets of
(a, b, c, d, e, f) that can generate an orthogonal ICT. The
integer set (a, b, c, d, e, f) gives an orthogonal ICT if, and
only if, $CC^t$ is a diagonal matrix. This is equivalent to the
requirement

$$ab - ac - bd - cd = 0 \tag{2}$$

with e and f arbitrary. The integer set (4, 2, 2, 0, 4, 2) satisfies
Eq. (2) and the corresponding ICT matrix is given
in Fig. 2(d). Notice that the nonzero integers chosen are all
powers of 2, and thus only simple binary additions and shift
operations are required for this ICT. The MSE versus entropy
performance of the compression scheme using this
multiplication-free ICT is shown in Fig. 3. In view of the
particular choice of the integers in the multiplication-free
implementation, one expects the performance of this ICT
to be inferior to that of the floating-point DCT and the
ICT of [6]. However, the difference in performance is
small, as shown in Fig. 3.
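
A minimal sketch of how candidate integer sets can be found follows. It enumerates small integers satisfying the condition ab - ac - bd - cd = 0 of Eq. (2) and shows one transform row of the (4, 2, 2, 0) set computed with shifts and additions only. The ordering constraint a >= b >= c >= d (to mimic the DCT magnitude profile), the row layout [4, 2, 2, 0, 0, -2, -2, -4], and the function names are assumptions made for illustration.

```python
def satisfies_eq2(a, b, c, d):
    """Orthogonality condition of Eq. (2): ab - ac - bd - cd = 0."""
    return a * b - a * c - b * d - c * d == 0

def candidate_sets(limit):
    """All (a, b, c, d) with limit >= a >= b >= c >= d >= 0 and c >= 1 that satisfy Eq. (2)."""
    return [(a, b, c, d)
            for a in range(1, limit + 1)
            for b in range(1, a + 1)
            for c in range(1, b + 1)
            for d in range(0, c + 1)
            if satisfies_eq2(a, b, c, d)]

def is_zero_or_power_of_two(x):
    return x == 0 or (x > 0 and (x & (x - 1)) == 0)

def multiplication_free(sets):
    """Keep only the sets whose entries are all zero or powers of 2."""
    return [s for s in sets if all(is_zero_or_power_of_two(v) for v in s)]

def row1_shift_add(x):
    """Inner product of x with [4, 2, 2, 0, 0, -2, -2, -4] (assumed row layout),
    using subtractions, shifts, and additions only."""
    return ((x[0] - x[7]) << 2) + ((x[1] - x[6]) << 1) + ((x[2] - x[5]) << 1)

sets = candidate_sets(10)
print(len(sets), "sets found;", multiplication_free(sets))

x = [10, -3, 7, 0, 5, 2, -8, 1]
row = [4, 2, 2, 0, 0, -2, -2, -4]
print(row1_shift_add(x), sum(r * v for r, v in zip(row, x)))
```

Both the set from [6], (5, 3, 2, 1), and the multiplication-free set (4, 2, 2, 0) appear in the enumeration, and the shift-and-add row agrees with the direct inner product.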


VI. A General Procedure for Constructing an 
ICT Matrix 

A general procedure to construct an N x N ICT matrix
is presented in this section. For any N x N ICT matrix,
this construction is done on the ground prior to implementation.
The DCT matrix is used as a template to generate
an ICT matrix. The procedure is as follows:

(1) Generate the N x N DCT matrix $A$.

(2) Construct an N x N matrix $B$ by substituting the N
possible absolute values in $A$ with N symbols, and
preserve the signs of the elements in $A$.

(3) Evaluate $BB^t$, and generate a set of independent
algebraic equations that force $BB^t$ to be a diagonal
matrix.

(4) Find a set of N numbers that satisfy the set of algebraic
equations generated in (3).

Since for a given N there are N(N - 1) nondiagonal
elements in $BB^t$, which is symmetric, part (3) of the
procedure gives N(N - 1)/2 quadratic equations. This set
of equations is too large to be handled easily except for
small N. However, by setting the most frequently occurring
symbol in $B$ to be an integer such as 1 or 2, the number
of independent equations decreases substantially. As shown
above, when N = 8, the number of equations is reduced
from 28 to 1. The most tedious part of the above procedure
is part (4), that is, finding N integers that satisfy the set
of nonlinear algebraic equations generated in part (3). By
using such advanced symbolic manipulation tools as
Mathematica [7], the effort of generating the set of
algebraic equations in part (3) and solving them in part (4)
can be greatly reduced. In fact, Mathematica was used in
an interactive manner to generate the 8 x 8 and 16 x 16
ICT matrices given in this article.

In order to obtain good compression performance, the 
set of N - 1 integers must have a magnitude profile similar
to the N - 1 floating-point elements of A. Furthermore,
if the multiplication-free property is desired, the set of 
N integers must be restricted to powers of 2. Some ad 
hoc techniques are usually needed to simplify the above 
calculations. 
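
Steps (1) through (3) of the procedure can be sketched without a symbolic algebra package. The code below is not the Mathematica session used by the authors; it is a small illustration that labels each distinct magnitude of the DCT matrix with a symbol index (0 for the largest magnitude), forms every off-diagonal entry of $BB^t$ as a quadratic polynomial in those symbols, and collects the distinct nonzero polynomials up to sign and scale. It assumes the standard orthonormal DCT scaling; all function names are hypothetical.

```python
import math
from collections import defaultdict
from functools import reduce

def dct_matrix(n):
    """n x n DCT matrix with the standard orthonormal scaling (assumed)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def symbol_template(n):
    """Steps (1)-(2): replace each distinct |a_kn| by a symbol index, keep signs."""
    a = dct_matrix(n)
    mags = sorted({round(abs(v), 9) for row in a for v in row}, reverse=True)
    index = {m: i for i, m in enumerate(mags)}
    return [[(1 if v > 0 else -1, index[round(abs(v), 9)]) for v in row]
            for row in a]

def offdiagonal_equations(template):
    """Step (3): distinct nonzero off-diagonal entries of B B^t, up to sign and scale.

    Each equation is a tuple of ((symbol_i, symbol_j), coefficient) terms.
    """
    n = len(template)
    equations = set()
    for r in range(n):
        for s in range(r + 1, n):
            poly = defaultdict(int)
            for (sign_r, sym_r), (sign_s, sym_s) in zip(template[r], template[s]):
                poly[tuple(sorted((sym_r, sym_s)))] += sign_r * sign_s
            terms = sorted((m, c) for m, c in poly.items() if c != 0)
            if not terms:
                continue  # this pair of rows is orthogonal for any symbol values
            g = reduce(math.gcd, (abs(c) for _, c in terms))
            lead = 1 if terms[0][1] > 0 else -1
            equations.add(tuple((m, lead * c // g) for m, c in terms))
    return equations

for eq in offdiagonal_equations(symbol_template(8)):
    print(eq)
```

For N = 8 a single equation survives. With the symbols indexed by decreasing magnitude, its four terms correspond to ab - ac - bd - cd, in agreement with the one independent equation reported for the 8 x 8 case.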

Note also that there is a general procedure for approximating
an orthonormal matrix arbitrarily closely by one
with rational entries. Given an orthonormal matrix $Q$
with no eigenvalue equal to $-1$, let $S = (I - Q)(I + Q)^{-1}$.
Then $S$ is skew symmetric since

$$
\begin{aligned}
S + S^t &= (I - Q)(I + Q)^{-1} + (I + Q^t)^{-1}(I - Q^t) \\
&= (I - Q)(I + Q)^{-1} + (I + Q^{-1})^{-1}(I - Q^{-1}) \\
&= (I - Q)(I + Q)^{-1} + (I + Q)^{-1}(Q - I) \\
&= 0
\end{aligned}
\tag{3}
$$

where the last step uses the fact that $I - Q$ commutes with
$(I + Q)^{-1}$.

Conversely, if $S$ is any skew-symmetric matrix such that
$-1$ is not an eigenvalue, then by essentially the same
computation, the matrix $Q = (I - S)(I + S)^{-1}$ is orthonormal.
Thus, given an orthonormal matrix $Q$, one can approximate
$S = (I - Q)(I + Q)^{-1}$ by an arbitrarily close rational
skew-symmetric matrix $S'$. Then, $Q' = (I - S')(I + S')^{-1}$
is a rational orthonormal matrix close to $Q$. While this
procedure works well in theory, there are practical difficulties
in its application. In practice, one considers $\lambda Q'$,
where $\lambda$ is an integer chosen so that $\lambda Q'$ is integral.
The matrix $\lambda Q'$ obtained by this procedure generally has
large entries, which makes it unsuitable for many applications.


VII. ICT for N = 16

In this section, the general procedure of Section VI is
used to construct a 16 x 16 ICT matrix. From Eq. (1), one
obtains the 16 x 16 DCT matrix $A$. Notice that there are
16 non-negative values in $A$. The 16 x 16 ICT matrix $B$
shown in Fig. 4(a) is obtained by using $A$ as a template.
By setting a = 1 and forcing all nondiagonal elements
in $BB^t$ to be zero, one obtains the following set of four
independent nonlinear equations:

$$bc + cf - df - bg - eg - eh + di - hi = 0 \tag{4}$$

$$bd - ce - de + fg + bh - fh + ci + gi = 0 \tag{5}$$

$$-cd + be + bf - cg - dh + gh + ei - fi = 0 \tag{6}$$

$$jk - jl - km - lm = 0 \tag{7}$$

Notice that n and o are arbitrary. By extensive search
on the sets of numbers that satisfy Eqs. (4), (5), (6), and
(7), the following solution, which has a magnitude profile
similar to that of the N - 1 floating-point elements of the
16 x 16 DCT matrix, was obtained: a = 1, b = 18, c = 18,
d = 16, e = 14, f = 14, g = 7, h = 10, i = 2, j = 10,
k = 9, l = 6, m = 2, n = 56, and o = 2. The corresponding
ICT matrix is shown in Fig. 4(b). Another solution to the
above system of equations is a = 1, b = 4, c = 4, d = 0,
e = 2, f = 2, g = 4, h = 0, i = 0, j = 4, k = 2,
l = 2, m = 0, n = 1, and o = 4. This matrix is shown
in Fig. 4(c). Notice that all nonzero integers in this solution
are powers of 2, so only binary shift and addition operations
are required for this ICT. Since there are many
zeros in this solution, one does not expect it to give an
efficient ICT matrix. Intuitively, a transform matrix with
good energy compaction should not have many zeros.
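
The two solutions quoted above can be verified by direct substitution, as in the sketch below. It assumes that Eq. (4) begins with the term bc and that Eq. (6) contains the term cg; under that reading every symbol appears exactly twice in each equation and both solutions give zero residuals. The dictionary and function names are illustrative.

```python
def residuals(s):
    """Left-hand sides of Eqs. (4)-(7) for a solution given as a dict of symbols."""
    b, c, d, e, f, g, h, i = (s[x] for x in "bcdefghi")
    j, k, l, m = (s[x] for x in "jklm")
    return (
        b*c + c*f - d*f - b*g - e*g - e*h + d*i - h*i,   # Eq. (4), assumed reading
        b*d - c*e - d*e + f*g + b*h - f*h + c*i + g*i,   # Eq. (5)
        -c*d + b*e + b*f - c*g - d*h + g*h + e*i - f*i,  # Eq. (6), assumed reading
        j*k - j*l - k*m - l*m,                           # Eq. (7)
    )

symbols = "abcdefghijklmno"
small_integers = dict(zip(symbols, (1, 18, 18, 16, 14, 14, 7, 10, 2, 10, 9, 6, 2, 56, 2)))
powers_of_two = dict(zip(symbols, (1, 4, 4, 0, 2, 2, 4, 0, 0, 4, 2, 2, 0, 1, 4)))

for name, sol in (("small integers", small_integers), ("powers of two", powers_of_two)):
    zeros = sum(1 for v in sol.values() if v == 0)
    print(name, residuals(sol), "zero entries:", zeros)
```

The symbols n and o do not enter any equation, which matches the remark that they are arbitrary.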


VIII. Conclusion 

This article explored the mathematical properties of a 
new class of integer transforms called the integer cosine 
transform and derived a general construction procedure 
for this transform. This procedure can be used to con- 
struct integer versions of other transforms, such as the 
Fourier [8], sine [9], Gabor [10], wavelet [11], and so forth. 
The basic idea is to approximate a floating-point transform 
with its integer counterpart in the hope of achieving com- 
parable performance with much-reduced implementation 
complexity. In the case of the discrete cosine transform, 
its integer counterpart, the ICT, has an implementation 
complexity substantially lower than that of the DCT and 
comparable to that of the Hadamard transform. Simulation
results indicate that the rate-distortion performance of
the ICT is only slightly inferior to that of the DCT.


Fig. 1. DCT-based compression system.